OAK-12244: index nodes that gain a mixin rule, delete stale docs when… #2949

Open
thomasmueller wants to merge 3 commits into trunk from OAK-12244
@thomasmueller thomasmueller commented Jun 11, 2026

Member

… mixin rule is lost (#2938)

TODO: the PR currently deletes descendant documents as well when the primary type or a mixin is changed or removed. This is incorrect and needs to be fixed.

When an existing node's applicable indexing rule changes at runtime (e.g. jcr:mixinTypes added or removed), FulltextIndexEditor did not update the index because propertiesChanged was never set — jcr:mixinTypes is not normally listed in a rule's property definitions.

Track wasIndexable (rule matched before) alongside isIndexable() (rule matches after). In leave(), act on transitions:

  • !wasIndexable && isIndexable(): node gained a rule → addOrUpdate
  • wasIndexable && !isIndexable(): node lost a rule → deleteDocuments
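The two transitions above can be sketched as a small decision function. This is only an illustration: the class `RuleTransitionSketch`, the enum `Action`, and the method `onLeave` are hypothetical names, not part of Oak, and the real `leave()` also has to handle ordinary property changes.

```java
// Hypothetical sketch: names are illustrative and not Oak's actual API.
final class RuleTransitionSketch {

    /** Index operation chosen when the editor leaves a node. */
    enum Action { ADD_OR_UPDATE, DELETE, NONE }

    /**
     * @param wasIndexable an indexing rule matched the node before the change
     * @param isIndexable  an indexing rule matches the node after the change
     */
    static Action onLeave(boolean wasIndexable, boolean isIndexable) {
        if (!wasIndexable && isIndexable) {
            return Action.ADD_OR_UPDATE; // node gained a rule
        }
        if (wasIndexable && !isIndexable) {
            return Action.DELETE;        // node lost a rule
        }
        return Action.NONE;              // no rule transition on this node
    }

    public static void main(String[] args) {
        System.out.println(onLeave(false, true));
        System.out.println(onLeave(true, false));
        System.out.println(onLeave(true, true));
    }
}
```

Returning `NONE` when the rule did not change leaves the existing `propertiesChanged` handling untouched, which is the assumption this sketch makes about how the two mechanisms coexist.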

Tests added:

  • PropertyIndexCommonTest: two end-to-end integration tests (all backends)
  • LuceneIndexEditor2Test: two unit tests verifying writer.docs / writer.deletedPaths

bhabegger and others added 2 commits June 25, 2026 11:23
)

Root cause: when a node gains or loses a mixin type at runtime,
FulltextIndexEditor did not update the index because propertiesChanged
was never set — jcr:mixinTypes is not normally listed in a rule's
property definitions.

Fix: track wasIndexable (rule matched before) alongside isIndexable()
(rule matches after). In leave(), act on the indexing-rule transition:
- !wasIndexable && isIndexable(): node gained a rule → addOrUpdate
- wasIndexable && !isIndexable(): node lost a rule → deleteDocument

Split FulltextIndexWriter into two explicit operations:
- deleteDocumentTree(path): node physically removed; cascade is correct
- deleteDocument(path): node lost indexability at runtime; exact only

The original deleteDocuments used a PrefixQuery that cascaded to all
descendants; in the mixin-loss branch this was a bug — children carrying
their own mixin types were incorrectly evicted from the index.
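The difference between the two delete operations can be illustrated with an in-memory set of indexed paths. This is a sketch under assumptions: it does not use Lucene, the class `DeleteScopeSketch` is hypothetical, and only the method names `deleteDocument` and `deleteDocumentTree` are taken from the commit message.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: simulates the index as a set of paths, without Lucene.
final class DeleteScopeSketch {
    private final Set<String> indexedPaths = new TreeSet<>();

    void add(String path) {
        indexedPaths.add(path);
    }

    /** Exact delete: removes only the document for this path. */
    void deleteDocument(String path) {
        indexedPaths.remove(path);
    }

    /** Cascading delete: removes the document for this path and all descendants. */
    void deleteDocumentTree(String path) {
        // Assumption: match on "path/" so that a sibling such as "/ab" is not
        // treated as a descendant of "/a".
        String childPrefix = path.endsWith("/") ? path : path + "/";
        indexedPaths.removeIf(p -> p.equals(path) || p.startsWith(childPrefix));
    }

    boolean contains(String path) {
        return indexedPaths.contains(path);
    }

    public static void main(String[] args) {
        DeleteScopeSketch index = new DeleteScopeSketch();
        index.add("/a");
        index.add("/a/b");
        index.deleteDocument("/a");     // mixin lost on /a: the child stays indexed
        System.out.println(index.contains("/a/b"));
        index.deleteDocumentTree("/a"); // /a physically removed: the child goes too
        System.out.println(index.contains("/a/b"));
    }
}
```

With the exact variant, a child that is indexable through its own mixin types survives the parent losing its rule, which is the behaviour the mixin-loss branch needs.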

Additional changes:
- Snapshot FT_OAK_12244_DISABLE once per commit cycle in FulltextIndexEditorContext
  as typeChangeTrackingEnabled so enter() and leave() always agree
- Skip getApplicableIndexingRule(before) on the hot path via hasNodeTypeChange
  guard when neither jcr:primaryType nor jcr:mixinTypes changed
- Register FT_OAK_12244 toggle in ElasticIndexProviderService
- Reuse CommitFailedException code 5 for the deleteDocument error path
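The `hasNodeTypeChange` guard might look roughly like the following. This is a minimal sketch, assuming node state can be modelled as a plain property map; the class name and the map-based signature are hypothetical, and the real editor works on Oak node states rather than maps.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch: node properties are modelled as a plain map.
final class NodeTypeChangeGuardSketch {

    /**
     * Returns true only if jcr:primaryType or jcr:mixinTypes differs between
     * the before and after state; other property changes are ignored.
     */
    static boolean hasNodeTypeChange(Map<String, Object> before, Map<String, Object> after) {
        return !Objects.equals(before.get("jcr:primaryType"), after.get("jcr:primaryType"))
                || !Objects.equals(before.get("jcr:mixinTypes"), after.get("jcr:mixinTypes"));
    }

    public static void main(String[] args) {
        Map<String, Object> before = Map.of("jcr:primaryType", "nt:unstructured");
        Map<String, Object> after = Map.of(
                "jcr:primaryType", "nt:unstructured",
                "jcr:mixinTypes", List.of("mix:title"));
        System.out.println(hasNodeTypeChange(before, after));
    }
}
```

When the guard returns false, the rule that applied before is presumably the same as the one that applies after, so the lookup against the before state can be skipped.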

Tests:
- PropertyIndexCommonTest: end-to-end integration tests (all backends)
- LuceneIndexEditor2Test: unit tests verifying writer.docs / writer.deletedPaths
- Verified: 1245 tests, 0 failures in oak-lucene

github-actions Bot commented Jul 3, 2026

Commit-Check ❌

Commit rejected by Commit-Check.

Type message check failed ==> Configure GitHub workflows to use concurrency cancel-in-progress for
pull requests

see recommended best practices at Apache
https://cwiki.apache.org/confluence/pages/viewpage.action?spaceKey=INFRA&title=GitHub+Actions+Recommended+Practices

Signed-off-by: Aurélien Pupier <apupier@ibm.com> 
The commit message should follow Conventional Commits. See https://www.conventionalcommits.org
Suggest: Commit message does not match the required pattern 

--- Commit 3/29:
Type message check failed ==> Use Commit Check action version v2.7.0
(f237ed0085f49444ab5c85bdfa5cdcd490fc09c5)

The former version was not allow-listed by ASF in
https://github.com/apache/infrastructure-actions/blob/main/actions.yml 
The commit message should follow Conventional Commits. See https://www.conventionalcommits.org
Suggest: Commit message does not match the required pattern 

--- Commit 21/29:
Type message check failed ==> Revert "OAK-12281: Upgrade jackson-databind dependency to 2.20.2 (#2976)"

This reverts commit 1aee3adee83ef87f1424af00da73f01f215dd538. 
The commit message should follow Conventional Commits. See https://www.conventionalcommits.org
Suggest: Commit message does not match the required pattern 

--- Commit 22/29:
Type message check failed ==> oak-http: extract Authorization Field parsing and add test coverage (#2981) 
The commit message should follow Conventional Commits. See https://www.conventionalcommits.org
Suggest: Commit message does not match the required pattern 

--- Commit 28/29:
Type message check failed ==> Merge pull request #2984 from apache/OAK-12287

OAK-12287: Update to Apache Parent POM to version 39 
The commit message should follow Conventional Commits. See https://www.conventionalcommits.org
Suggest: Commit message does not match the required pattern

@fabriziofortino fabriziofortino left a comment

Contributor

The logic looks good to me. I just added a potential improvement.

Re feature toggles: the number of toggles is increasing, and the code is getting more complex because of them (a lot of if/else). It's okay to have them in the short term, but I am concerned they won't be removed. I suggest adding a time-bombed test (e.g. https://github.com/apache/jackrabbit-oak/pull/2925/changes#diff-7f701488919abf1ac0a96ed15d558fc9615f2dc3a420f5ec58545fca43ba7990R38-R44) so that we don't forget to remove them.
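A time-bombed check could be sketched as below. This is not the code from the linked PR: the class name, the method, and the deadline date are all hypothetical, and a real version would be a JUnit test in the module that calls the check with `LocalDate.now()`.

```java
import java.time.LocalDate;

// Hypothetical sketch of a time-bombed check; the deadline is an assumed value.
final class ToggleExpirySketch {

    /** Assumed date by which the FT_OAK_12244 toggle should be gone. */
    static final LocalDate REMOVE_BY = LocalDate.of(2027, 1, 1);

    /** Fails once the given date is past the deadline, forcing a cleanup. */
    static void checkToggleNotExpired(LocalDate today) {
        if (today.isAfter(REMOVE_BY)) {
            throw new AssertionError(
                    "Feature toggle FT_OAK_12244 should have been removed by " + REMOVE_BY);
        }
    }

    public static void main(String[] args) {
        // Passes while the current date is on or before the assumed deadline.
        checkToggleNotExpired(LocalDate.now());
    }
}
```

Taking the date as a parameter keeps the check itself testable without depending on the system clock.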

Comment on lines +127 to +130
    public boolean isTypeChangeTrackingEnabled() {
        return typeChangeTrackingEnabled;
    }

Contributor

I would remove this and roll back all the changes in this class. typeChangeTrackingEnabled is always derived from the FT flag, and isTypeChangeTrackingEnabled is only called in FulltextIndexEditor, which defines the toggle. Since the toggle should be removed after a while, I propose rolling back the changes in this class and checking the FT flag explicitly instead of calling context.isTypeChangeTrackingEnabled().

3 participants